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Abstract 

Background: Numerous models for use in interpreting quantitative PGR (qPGR) data are present in recent literature. 
The most commonly used models assume the amplification in qPGR is exponential and fit an exponential model 
with a constant rate of increase to a select part of the curve. Kinetic theory may be used to model the annealing 
phase and does not assume constant efficiency of amplification. Mechanistic models describing the annealing 
phase with kinetic theory offer the most potential for accurate interpretation of qPGR data. Even so, they have not 
been thoroughly investigated and are rarely used for interpretation of qPGR data. New results for kinetic modeling 
of qPGR are presented. 

Results: Two models are presented in which the efficiency of amplification is based on equilibrium solutions for 
the annealing phase of the qPGR process. Model 1 assumes annealing of complementary targets strands and 
annealing of target and primers are both reversible reactions and reach a dynamic equilibrium. Model 2 assumes all 
annealing reactions are nonreversible and equilibrium is static. Both models include the effect of primer 
concentration during the annealing phase. Analytic formulae are given for the equilibrium values of all single and 
double stranded molecules at the end of the annealing step. The equilibrium values are then used in a stepwise 
method to describe the whole qPGR process. Rate constants of kinetic models are the same for solutions that are 
identical except for possibly having different initial target concentrations. Analysis of qPGR curves from such 
solutions are thus analyzed by simultaneous non-linear curve fitting with the same rate constant values applying to 
all curves and each curve having a unique value for initial target concentration. The models were fit to two data 
sets for which the true initial target concentrations are known. Both models give better fit to observed qPGR data 
than other kinetic models present in the literature. They also give better estimates of initial target concentration. 
Model 1 was found to be slightly more robust than model 2 giving better estimates of initial target concentration 
when estimation of parameters was done for qPGR curves with very different initial target concentration. Both 
models may be used to estimate the initial absolute concentration of target sequence when a standard curve is 
not available. 

Conclusions: It is argued that the kinetic approach to modeling and interpreting quantitative PGR data has the 
potential to give more precise estimates of the true initial target concentrations than other methods currently used 
for analysis of qPGR data. The two models presented here give a unified model of the qPGR process in that they 
explain the shape of the qPGR curve for a wide variety of initial target concentrations. 
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Background 

Quantitative Polymerase Chain Reaction, qPCR, has 
become a common tool of molecular biology to deter- 
mine the absolute or relative concentrations of particu- 
lar DNA sequences in samples. The method gives 
fluorescent values for each of a number of consecutive 
cycles beginning with cycle 1. Before cycle 1 the initial 
concentration of double-stranded DNA target sequence 
is To and its concentration is amplified in successive 
cycles to produce a high concentration of the target 
sequence. At each cycle, the efficiency of amplification 
(E) is the ratio of the amount of newly synthesized tar- 
get at the end of the cycle to the amount present at 
the beginning, thus the amount of target at the end of 
the cycle is (1 + E) times the amount at the beginning. 
If every single-stranded target molecule re-associates 
with exactly one primer molecule and all these struc- 
tures are extended by polymerase to completely 
synthesize the complementary target strand, the value 
of E is 1, which is its theoretical upper limit. The in- 
crease in DNA concentration is monitored by a detec- 
tion system which generates fluorescence in proportion 
to the concentration of target DNA sequence. Several 
difl'erent types of detection systems are in common 
usage that generate fluorescence through different 
types of reactions. I present here two models of qPCR 
for use when the detection systems uses DNA binding 
dyes such as SYBR green or SYTO-13. During early 
cycles of the amplification, the concentration of target 
is too small to produce measurable fluorescence and is 
called the lag phase (Figure 1). During lag phase the 
concentration of target sequence increases by (1 + E) 
fold in every cycle, but the initial concentration of the 
double-stranded target sequence (Tq) is so small that 
even with repeated increases of (1 + E) fold in every 
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Figure 1 A typical amplification curve resulting from a qPCR 
experiment. 



cycle the target concentration is not high enough to 
be measurable. Thus, no information on E can be 
obtained from the amplification curve during lag 
phase. When target is present in quantities sufficient 
to be measured, the increase in fluorescence is ap- 
proximately exponential over a number of successive 
cycles and the reaction is said to be in the exponential 
phase (Figure 1). Fluorescence values during the expo- 
nential phase may be used to estimate E and most 
models of qPCR use only this part of the curve to es- 
timate E and then assume the estimated value applies 
throughout the lag phase as well. After exponential 
phase, the efficiency progressively declines due to 
changes in the concentrations of reactants. Eventually 
the reaction enters stationary phase, during which E 
approaches zero and increases in fluorescence are min- 
imal (Figure 1). 

Models of qPCR assume total fluorescence, F, is due to 
two sources: Baseline fluorescence, ^F, and Target fluor- 
escence, tF> thus total fluorescence is F=i,F-\-tF' Base- 
line fluorescence is due to all sources other than target 
and is assumed to be constant throughout the qPCR ex- 
periment. Target fluorescence is generated by reaction of 
the detection system with target sequences and is gener- 
ated in proportion to target concentration. Thus, tF = 
KfT where T is the concentration of the double- stranded 
target and Kf is a constant that converts the target con- 
centration to fluorescence. Total fluorescence at the end 
of the n^^ cycle is: 



Fn=bF + KfTn 



(1) 



where is target concentration at the end of the reas- 
sociation/extension step of cycle n. At the beginning of 
the n^^ cycle, target DNA is present at concentration 
Tyi.i and is replicated with amplification efficiency 
Target concentration at the end of cycle n is then: 
Tyi = Tyi-i{l + Eyi) 
which has the solution: 



Tn = To\[{l^Et) n = 1,2,3,. 



(2) 



i=l 



Since jFn = KfT^ then r^H=r^o]^ (1 +^/) and 

i=l 

n 

Fn=bF+TFo\{ (1 + £.0 n = 1, 2, 3, . . . (3) 



where tFq is the amount of fluorescence that could be 
produced by the target sequence before cycle 1. Values 
of Eyi are defined as E^ = — 1 for f2 = 1, 2, . . . and if 
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equation 1 holds then £^ may be estimated from the 
observed fluorescence values as: 

En=p^^-l for f2 = 1,2,3, .... (4) 

Equation 4 is useful when F„ can be reliably distin- 
guished from ijF, which is possible only during exponen- 
tial phase and later. Models of qPCR describe how 
changes with n and use equation 2 to determine the tar- 
get concentration or equation 3 to determine target 
fluorescence at each step of the process. Some models of 
qPCR model ^F^ instead of and thus use the estimate 
of T^o instead of Tq [1]. The estimated t^o values can be 
converted to Tq by dividing by Kf when a value for Kf is 
available. The primary goal in analysis of qPCR data is 
to estimate Tq or t^o from the qPCR curve as accurately 
as possible. Frequently, two or more qPCR curves are 
analyzed with a goal of determining the value of the 
ratio of two Tq values when Kf is not known. The ratio 
of two Tq values is estimated by the ratio of two t^o 
values provided Kf values are the same in the two qPCR 
experiments. This principle is often used in qPCR ana- 
lyses to compare two or more samples by estimating the 
ratio of two Tq values by the ratio of two estimated t^o 
values [2-4]. 

Numerous approaches to modeling the qPCR process 
are present in recent literature and involve three differ- 
ent general approaches. First the Q and Linear Regres- 
sion approaches assume constant efficiency and estimate 
it by linear regression of ln(F„-^f) on cycle number 
using only values from the exponential phase of the 
curve (sometimes called the 'Window-of-Linearity ) [3,5- 
9] . Second, the sigmoidal function approach has used to 
fit a variety of sigmoidal functions to approximate the 
qPCR curve (Figure 1) and estimate t^o as the intercept 
of a sigmoidal function [1,10,11]. The sigmoidal func- 
tions used are described in the literature and none are 
based on a mechanistic model of the reactions in qPCR. 
Thirdly, kinetic models are a mechanistic approach to 
modeling the process and several such models have been 
proposed [8,12-14]. The work I present here extends the 
kinetic approach to modeling qPCR by including primer 
concentration in the model and providing analytical 
equilibrium solutions for the re-association phase. I also 
introduce the use of simultaneous analysis of qPCR 
curves which have common values for rate constants in 
the kinetic models. 

Current models 

The models presented here may be regarded as an ex- 
tension of the models presented by Smith et al (2007) 
[13] and Boggy and Woolf (2010) [14]. Though the mod- 
els presented here are similar to those of Smith et, al. 



(2007) [13] for hydrolysis probe detection systems, they 
differ in that they are designed for intercalating dye de- 
tection systems such as SYBR green. They also use ana- 
lytical solutions for equilibrium concentrations in the 
annealing step. One of the models is the same as that of 
Boggy and Woolf (2010) [14], however, a more general 
solution for equilibrium values for the annealing step is 
presented here that includes primer concentration and 
gives improved fit to data. Two stepwise kinetic equilib- 
rium models are presented in which efficiency of ampli- 
fication at each cycle depends on the concentrations of 
target and primers at that cycle. Events in a single cycle 
are modeled and then used in iterations to model the 
entire process. In each cycle, the re-association and the 
primer extension steps are modeled separately and con- 
secutively. Double-stranded target DNA is denoted 
Ai'A2 where Ai and A2 are complementary single 
strands. Primer that hybridizes with is denoted ai 
and primer for is denoted Primer- template hybrid 
molecules are denoted Ai-ai and A2 'a2, respectively. 

Re-association step 

At the end of the dissociation step of cycle n, the concen- 
tration of all double-stranded structures is zero and con- 
centrations of single-stranded molecules Ai, A2, ai, a2 are 
Slyi,o, S2yi^0i Pln,o and P2yi^0i respectively, with units of con- 
centration being moles/liter. At the end of the re- 
association step, the concentrations of Ai, A2, ai, a2, 
Ai'Ui A2'a2 and Aj'A2 are assumed to be at their 

equilibrium values Slj^^^y ^^n,e> P^n,e> P^n,e^ Qln,e^ Q2n,ey 

and D^e, respectively. Two different kinetic models are 
each used to obtain these equilibrium concentrations as 
a function of the initial concentrations Slyj^o> ^2n,o> Pln,o 
and P2yi^0' The initial target is entirely double- stranded 
at the beginning of cycle 1, so the two complementary 
target strands, Ai and A2, are initially present in equal 
concentrations. Primers, ai and ^2 are assumed to be 
present in equal concentrations at the beginning of 
cycle 1, equally effective in forming double-stranded 
structure with the target sequences, and equally effect- 
ive in initiating synthesis with the polymerase. Under 

these assumptions, Slyj^e = S2n,e = ^n,e> PJ^n,e = P2^,e = Pn,e> 

and Ql^^e = Q^n,e = Qn,e for all n (Table 1). 
Model 1 reversible re-association 

Reversible dissociation/re-association reactions and rate 
constants are shown in Figure 2 A. Formation oi Ai-ai 



Table 1 Summary of initial and equilibrium values for 
state variables of Model 1 and Model 2 
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Figure 2 Panel A: Dissociation/re-association reactions and rate 
constants for Model 1. Panel B: Differential equations for model 1. 



A2'a2 and Aj--A2 from Aj, A2, aj, a2 during the re- 
association step is assumed to follow the law of mass ac- 
tion and described by the rate equations given in 
Figure 2B. At equilibrium, all net rates are zero and 
equilibrium values of the state variables 5'^^^, P^^^, e 
and Dyi^e and are found as the simultaneous solution of 
equations in Figure 2B when each equation is set equal 
to zero. Through a series of substitutions and rearrange- 
ments of these equations (see Additional file 1), a cubic 
polynomial in Syi,e is obtained with coefficients contain- 
ing only rate constants and initial concentrations Sn,o 
and Pn,o and is given as equation A1.6 in Additional file 
1. The relevant root of the cubic equation is found using 
the cubic formula given in equation A1.7 of Additional 
file 1 and equilibrium values for the remaining state 
variables may be found by substituting the value of Sn,e 
into equations also given in Additional file 1. Thus S^^e, 
Pn,e^ Qn,e ^n,e ^^e expressed as a known function of 
the rate constants kai2) kdi2> ka> and ka and the initial 
concentrations S^^o and P^^o- 

Model 2 non-reversible re-association 

Non-reversible re-association reactions and rate con- 
stants are shown in Figure 3A. Reactions are again 
assumed to follow the law of mass action and rate 
equations are given in Figure 3B. At equilibrium, all 
net rates are zero; however, equilibrium values of the 
state variables S^^e^ Pn,e cannot be found as the simul- 
taneous solution of equations in Figure 3B when set 
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Figure 3 Panel A: Dissociation/re-association reactions and rate 
constants for Model 2. Panel B: Differential equations for model 2. 



equal to zero. Since /c^>0, kai2>0, >0 and >0 
then Syi^e = 0 is the only possible equilibrium solution 
and with 5'„,e = 0, any value of will satisfy the equi- 
librium equations. This result is also expected on 
purely scientific grounds as annealing without dissoci- 
ation will always eventually reduce the single strand 
template concentration to a concentration of zero. The 
equilibrium value, P^ ^, is found by expressing the rate 
of change in P„ as a function of 5'^ and then integrat- 
ing over Syi as it goes from 5'^ ^ to zero. The solutions 
for P^e, Qyie and are given as equations A2.4, 

A2.5 and A2.6, respectively, in Additional file 2 which 
are all known functions of /c^, kai2> Sn,o Pn,o (see 
Additional file 2). 

Extension step 

The proportion of target/primer duplexes Ai-Ui and 
A2'a2 that produce full length products in the primer 
extension step depends on the concentration of the 
polymerase and the concentration of dNTPs. If both 
are in sufficiently high concentration the proportions 
will be near 1, however, if either or both are suffi- 
ciently low the probabilities will be less than 1. Here it 
will be assumed there is always sufficient polymerase 
and dNTP concentration to ensure complete extension 
of all primers in target/primer duplexes. With this as- 
sumption, at the beginning of the next cycle the con- 
centrations of Ai and A2 are each given by 
equation 5.1 and the concentrations oi Ui and ^2 are 
each given by equation 5.2. 



«^«+l,0 — ^n,0 + Qn,e ^ — 1? 2, .... 



- w+1,0 



1,2, 



(5.1) 



(5.2) 
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The whole process 

The whole qPCR process is modeled by iteration of 
equations 5.1 and 5.2 with Q^ ^ determined by equation 
A1.9 for model 1 or equation A2.5 for model 2 (see 
Additional files 1 and 2). Before cycle 1, the concentra- 
tions of single-stranded target and primer are Sq^o and 
Po,o> respectively. The Efficiency of amplification for sin- 
gle stranded template at cycle n is {Sn+i,o l^n,o)''^ which 
may be obtained from equation 5.1 as 

En = {^A n = 1,2,3,... (6) 

and fluorescence at cycle n is found using equation 1 as 

Fn = bF^Kf{2Qn^e^Dn^e) n = 1, 2, 3, . . . (7) 

Since equation 7 is defined only for n = 1,2,3. • • it can- 
not be used directly to obtain a value of x^Qy however, 
fitting equation 7 to an actual qPCR curve provides an 
estimate of So,o which is the initial concentration of each 
oi Ai and If the initial target sample is all double- 
stranded then So,o is also the concentration of the 
double-stranded molecules, so Tq = So,o« Defining t^q as 
Kf^ao provides a measure of total initial target concen- 
tration in fluorescence units. This is the amount of 
fluorescence that would be due to the target if it is all 
double-stranded. If t^q could actually be measured it 
would be Kf Do^e where Do^e is the concentration of 
double-stranded target DNA before cycle 1. Most likely 
nearly all of the target is double-stranded before cycle 1 
but, since the total amount of target is what is desired 
using KfSo^o as the estimate of t^o is preferred. 

Relating Model 2 to Boggy & Woolf (2010) 

Model 2 annealing kinetics have been analyzed by Boggy 
& Woolf [14] who give an approximate equilibrium so- 
lution for the re-association step and use it to obtain a 
recurrence formula for double stranded target concen- 
tration over successive cycles. Their approximation is 
obtained by removing primer concentration from the 
model to obtain a simpler set of differential equations 
which they solve to get an approximate solution. I 
show in Additional file 3 that my solution for model 2 
becomes equivalent to theirs if ^^^z^ is small and their 

parameter k is set to (^ kJl-k^ Pn,e' This result indicates 

the rate constant, k, for the approximate solution 
found by Boggy & Woolf [14] depends on primer con- 
centration and ignoring primer concentration in the 
analysis will cause at least some error in parameter 
estimates. Boggy and Woolf (2010) [14] note that their 
parameter k varied from one analysis to another, a re- 
sult they did not expect, and this is likely due to varia- 
tions in the amount of primer available. This finding 



suggest the analysis for model 2 presented here is a 
more accurate solution for model 2 kinetics than that 
of Boggy and Woolf (2010) [14] process since it 
includes variation in primer concentration. 

Restricting model fit to only part of the qPCR curve 

Since models 1 & 2 both include the effect of declining 
primer concentration during PGR they predict a differ- 
ent shape of the qPCR curve between lag and stationary 
phase for solutions that differ in primer concentration. 
Consider two solutions, one with low So,o and one with 
high So,o and with identical primer concentrations. Allow 
the one with low So,o to undergo enough cycles so its 
current target concentration is equal to the initial target 
concentration for the solution with higher Sq^o ♦ If primer 
concentration is not limiting in any way then the two 
solutions would have identical qPCR curves from that 
point on. If primer concentration is limiting in some 
way then the qPCR curve for the solution with lower 
So,o will at that point rise more slowly than that for the 
solution with higher So,o due to lowered primer concen- 
tration. When fitting models 1 and 2 to data it is thus 
desirable to include as much of the exponential phase as 
possible to allow changing primer concentrations to 
affect the analysis. Including most of the exponential 
phase is desired, however, in late exponential phase 
effects not included in the model also begin to occur 
and may cause the model to have poor fit. Effects in late 
exponential phase not modeled include reduced amount 
of dNTPs, reduced amount of polymerase due to decay 
and possibly partially replicated templates. In order to 
restrict analyses to only part of the exponential phase, 
models were fit to qPCR curves for cycles in the lag and 
exponential phase for which (Fn - bF)/(Fmax - b?) ^ L> 
where and F^ax are baseline fluorescence and max- 
imum fluorescence achieved in stationary phase. The 
value, L, is the proportion of increase of Fn above bF and 
is a cut off value used to restrict the analysis to the first 
parts of the qPCR curve up to a point in the exponential 
phase where the model is thought to be valid. When 
choosing L, a compromise between these opposing 
effects is necessary and is done here by choosing the 
highest value for L that allows good fit to qPCR curves. 
To accomplish this, the models were fit to qPCR curves 
using a range of L values from 0.1 to 0.98 and the good- 
ness of fit and estimated initial concentration deter- 
mined for each L value. Both model 1 and 2 generally 
gave poor fits for L > 0.95. The value of 0.8 was chosen 
for presentation here because the MSresidual values for 
most qPCR curves increased for values of L higher than 
0.8 and also because the accuracy of estimates of initial 
target concentration was better for L values near 0.8. 
The question of how much of the qPCR curve to use in 
analysis is a problem inherent to most methods used for 
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interpretation of qPCR curves. Boggy and Woolf (2010) 
[14] choose the point of maximal increase in fluores- 
cence as their cut-off point for all curves which corre- 
sponds roughly to a value of 0.5 for L. Their rational for 
using only about half the exponential phase is that the 
effects of declining primer concentration are not likely 
to affect the shape of this part of the curve. The Q 
method uses a different cut-off for every curve which is 
invoked when the window-of-linearity' is chosen. 

Methods for incorporating experimental design into the 
data analysis 

Kinetic models of qPCR allow more sophisticated and 
powerful analyses than are possible with other models. If 
a group of solutions have exactly the same chemical 
components except for possibly differing in initial target 
concentration, then all the kinetic parameters should be 
the same for all the solutions. The rate constants may 
then be estimated by fitting several or many qPCR 
curves simultaneously to get better estimates of rate 
constants and initial target concentrations. In such fits 
the kinetic parameters are common to all the different 
qPCR curves in the analysis but each curve may have a 
unique value for initial target concentration. Also kinetic 
models such those presented here allow an analyst to ad- 
just the model to accommodate different experimental 
designs. I present four different estimation methods (A, 
B, C, and D) which are used here to fit models to qPCR 
data which have different experimental designs. 

Standard curve method (Method A) 

This method may be used only when the true initial tar- 
get concentrations are known for each curve in the ana- 
lysis (a standard curve). All initial conditions of the 
qPCR, other than initial target concentration are 
assumed to be the same for all curves. All curves are 
analyzed simultaneously in a non-linear fit in which the 
estimated rate constants apply to all curves and the ini- 
tial target concentration is held constant at its known 
value for each curve. The fitting process minimizes the 
cost function over all cycles in all curves. The advantage 
of this method is that it pools the information of all the 
curves to give a single best estimate for each model par- 
ameter. The predicted fluorescence values provided by 
the analysis may be compared to the observed fluores- 
cence values to assess how well each qPCR curve is fit 
by the model. The value of this method is that it may be 
used for estimation of initial target concentration for 
samples with unknown initial target concentration pro- 
vided all other conditions of the PGR are the same as 
those used in the standard curve. This is done by doing 
a non-linear fit to the qPCR curve of the unknown sam- 
ple in which only the initial target concentration, So,o > is 
estimated while using the estimated parameters from the 



standard curve analysis as constants. To assess how well 
the models presented here estimate So,o > each qPCR 
curve in the known samples was treated as an unknown 
sample and the estimated Sq^o value compared to the 
known value for Sq^o- 

Dilution curve method (Method B) 

This method may be used when a standard curve is not 
available but a dilution series of the unknown sample is 
available. Here the absolute target concentration in the 
undiluted sample is unknown and denoted So,o,max' The 
absolute target concentration of the i^^ diluted sample is 
^i^o,o,max where di is the dilution ratio for the i^^ sample. 
The qPCR curves from the diluted samples are analyzed 
simultaneously using non-linear curve fitting. Rate con- 
stants apply to all curves in the analysis and a single 
concentration parameter So,o,max is estimated with the 
initial concentration for each sample set at <^iSo,o,max' In 
general usage of this method the estimated So,o,max 
would be the end result. To assess the accuracy of this 
method in this study the known absolute target concen- 
trations for the data sets used here are ignored but their 
ratios are used to obtain values for di. Simultaneous fit- 
ting of all the curves was done as described above to es- 
timate So,o,max and all the rate constants. Next the 
estimated rate constants are treated as constants in a 
second non-linear fit of each qPCR curve separately in 
which an So,o value is estimated for qPCR curve. These 
estimated So,o values are compared to the known values 
to assess the accuracy of method B. 

Simultaneous curves method (Method C) 

This method may be used when two or more samples 
are to be analyzed that may have different initial concen- 
trations and a standard curve is not available nor are any 
dilutions of either sample available. The samples are 
assumed to be subject to qPCR with identical conditions 
except for the fact they may have different initial target 
concentrations. In this method all samples are analyzed 
simultaneously with estimated rate constants applying to 
all samples, but each sample has a unique So,o value to 
be estimated. To assess the accuracy of this method with 
the data sets analyzed here all information on initial tar- 
get concentration is ignored and the estimated So,o 
values for each sample is then compared to the known 
values. 

Separate curves method (Method D) 

This method does not assume any similarity of rate con- 
stants among any of the samples and does not analyze 
samples simultaneously as methods A, B, and C do. Each 
sample is analyzed separately with all parameters esti- 
mated independently for each qPCR curve. Here the ac- 
curacy of method C is assessed by comparing the actual 
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So,o values to the estimated ones. This is the type of ana- 
lysis that is conventionally done when analyzing qPCR 
curves. 

Methods 

Data sets 

Both models 1 and 2 were fit to two different data sets for 
which the actual initial concentrations of target are known. 
The model of Boggy and Woolf (2010) [14] was also fit to 
the same data sets and is called Model 0. The data sets 
used are: 1) Data of Boggy and Woolf (2010) [14], 2) Data 
of Rutledge (2004) [15] and are referred to as data set 1 
and 2, respectively. The data of Boggy and Woolf (2010) 
[14] are for a 129 base pair synthetic target sequence with 
the following conditions for PGR: reaction vol = 25 [A, pri- 
mer concentration = 400 nM each, detection system is 
SYTO-13 dye with concentration = 2 (iM, dNTP 
concentration = 0.2 mM each. Six different absolute target 
concentrations range from 5 x 10^ to 5 x 10^ molecules 
per reaction volume at 10 fold increments. The data of 
Rutledge (2004) [15] is for the 102 base pair K1/K2 tar- 
get sequence with the following conditions for PGR: reac- 
tion vol = 35 [A, primer concentration = 0.25 [iM each, 
detection system is Syber Green dye. Six different absolute 
target concentrations ranged from 4.17 x 10^ to 4.17 x 10^ 
molecules per reaction volume at 10 fold increments. Data 
set 1 has 2 replicate qPGR curves for each target concen- 
tration and data set 2 has 5 replicates for each target 
concentration. 

Determining baseline fluorescence 

Before fitting a model the baseline fluorescence was 
determined for each qPGR curve by fitting the sigmoidal 
function given in equation 8 below. 



1 + exp 



{-^} 



(8) 



The first term on the right side of equation 8 is the 
baseline fluorescence, . The second term is a modified 
sigmoidal function describing the increase in fluores- 
cence due to polymerase chain reaction where F^^^ is 
the maximum fluorescence, Ci/2 is the cycle number at 
which Fyi is midway between and Fyyiax> and ks is a 
scale parameter. Parameters in equation 8 and bounds 
used in the non-linear estimation program are ^F>0, 
Fmax > bF> Ci/2 > 1, and ks > 0. Equation 8 is fit to each en- 
tire qPGR curve by nonlinear estimation of the para- 
meters using least squares as the criterion for fit with 
the PROG NLIN procedure of the SAS program version 
9.3 [16]. Each observation was weighted by the recipro- 
cal of the observed fluorescence. The value of ^F is esti- 
mated separately for each qPGR curve and then treated 
as a constant for that curve when estimating parameters 



of models 0, 1, and 2. In preliminary analyses for this 
study the baseline fluorescence was estimated as a par- 
ameter in the model with each qPGR curve having its 
own unique value. In the resulting fits some qPGR 
curves were found to have baseline values that were not 
consistent with the observed values in the lag phase. 
This is due to the estimation procedure selecting a base- 
line value that gives the best overall fit to the curve ra- 
ther than just in the lag phase. Ruijter et al (2009) [9] 
have shown that baseline estimates can affect the quality 
of fit in the exponential phase for the Gt method and 
this same effect is likely occurring in fits of the models 
here. They also point out inherent problems with esti- 
mating baseline fluorescence from a fixed number of 
points in early cycles. Rutledge and Stewart (2008) [1] 
showed sigmoidal functions similar to equation 8 give 
very good fit to the observed values of qPGR curves. 
The use of the sigmoidal function given in equation 8 is 
found to always give values that fit well to data in the 
lag phase and is not affected much by peculiar changes 
in fluorescence often seen during the first few cycles. 

Estimation of model parameters 

Methods described here are used to determine how well 
each model explains the shape of each qPGR curve and 
also to determine how accurately each model estimates 
initial target concentration for each of the two data sets. 
Each model, 0, 1 or 2, was fit to qPGR data by nonlinear 
curve fitting with PROG NLIN of SAS version 9.3 [16] 
using the Marquardt optimization method and the mean 
sum of squared differences between the observed and 
predicted fluorescence values (MSresidual) as the meas- 
ure of goodness-of-fit. Mean Square Residual (MSresi- 
dual) is a measure of how well a model explains the 
qPGR curve. A MSresidual value near zero indicates a 
very good fit in which the model predicted values are all 
very close to the observed values and higher values indi- 
cate a poorer fit. Up to 10,000 iterations were used for 
each fit. The parameters estimated and bounds imposed 
during estimation are: So,o > 0> k > 0 for model 0, 5^,0 > 0 , 
Kf >0, Ks > Kdi2 >0 for model 1, and So,o >0 , /</ >0 , and 
K >0 for model 2. The initial concentrations of primers 
were set to the values reported by the authors providing 
the two data sets and expressed as nmoles/L. Additional 
file 4 contains SAS code that fits model 1 and 2 to a 
sample data set. 

Effect of varying L 

To determine the effect of the choice of the value of L 
on the results of the analysis, the entire analysis was 
done separately for each of a range of values for L. Spe- 
cifically, each of the three models (0, 1, 2) was fit using 
each of the estimation methods (A, B, G, D) to each of 
the two data sets (1, 2) for each of the L values of 0.1, 
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0.2, 03, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.98. The MSresi- 
dual and Fold Error for each fit were averaged over 
replicates 1 and 2 for each data set and the resulting 
means plotted versus L for each dilution separately. 

Effect of errors in estimation of baseline fluorescence 

Analyses were done to assess the effect of errors in the 
estimation of baseline fluorescence, bF> on the goodness 
of fit of the model and the estimated initial target concen- 
tration. The estimation procedure used to estimate 
provides the standard error of the estimated value. Fits of 
the models and estimation of initial target concentration 
was done using bF values 1 and 2 standard deviation units 
above and below the estimated value. Plots of the mean 
MSresidual and Fold Error in estimation of initial target 
concentration versus the bF value used were used to as- 
sess the effect of errors in estimation of bF. 

Results 

For each of the three models (0,1,2), each of four 
methods for parameter estimation were used for each 
data set. Here every model and estimation method was 
used to fit the same data and MSresidual is used to 
compare models and methods in their goodness of fit 
to the data. A value of MSresidual was computed for 
every qPCR curve for each model and for each estima- 
tion method. Plots of MSresidual versus logio(To) are 
given in Figure 4 for data set 1 and Figure 5 for data 
set 2. Data set 1 has two replicate MSresidual values 
for each target concentration and data set 2 has 5. 
These figures show that for both data sets and for aU 
4 methods of estimation, models 1 and 2 are very 
similar in fit to qPCR curves and both give better fit 
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Figure 5 Goodness of fit to qPCR curves for data set 2 with 
L = 0.8. Plots are MSresidual vs known initial logio target 
concentration, To, in nM/L Panels A,B,C and D contain plots for the 
Standard curve. Dilution curve. Simultaneous curves, and Separate 
curves estimation methods, respectively. Plot symbols are: blue 
circle = model 0, red plus = model 1, green X = model 2. 



than model 0. The poorer fit for model 0 is always 
due to the predicted fluorescence being too high for 
low cycle numbers and too low for high cycle num- 
bers. This pattern of lack of fit for model 0 is shown 
in Figure 6 which shows the fits for each model and 
estimation method for the first replicate of the sample 
with 5*10^ molecules in data set 1 and is representa- 
tive of the fit found for all samples in both data sets. 
Note that in Figure 6, Models 1 and 2 fit the data well 
for all estimation methods (A,B,C,D) while Model 0 
gives poor fit with methods A and B, moderately good 
fit with method C, and good fit only with method D. 
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Figure 4 Goodness of fit to qPCR curves for data set 1 with 
L = 0.8. Plots are MSresidual vs known initial logio target 
concentration, Tq, in nM/L Panels A,B,C and D contain plots for the 
Standard curve. Dilution curve. Simultaneous curves, and Separate 
curves estimation methods, respectively. Plot symbols are: blue 
circle = model 0, red plus = model 1, green X = model 2. 
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Figure 6 Fit of models to qPCR curves for each model and each 
estimation method for data set ^, rep = 1, dilution = 1E-2, and 
L = 0.8. Plot symbols for predicted fluorescence are:blue 
circle = model 0, red plus = model 1 , green X = model 2. The plot 
symbol for observed fluorescence is delta (6). 
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Estimation of absolute target concentration 

The estimates of initial target concentrations and the 
known initial target concentrations are denoted So,o 
and To, respectively. Plots of logio(mean So,o) vs logio 
(To) for each model and estimation method are given 
in Figure 7 for data set 1 and Figure 8 for data set 2. 
Here means are over replicates for each sample. 
Fi gures 7 and 8 indicate all three models give So o 
values that have a good linear relation to the true 
values, To, for estimation methods A and B with both 
data sets. For estimation methods C and D the linear 
relation between So,o and To holds well for all three 
models when applied to data set 1 ,however, for data 
set 2 model 1 and 2 are very linear, but model 0 gives 
non-linear results. The high linearity of these log-log 
plots is not a good indication of the accuracy of the es- 
timation because the estimates can be linear but at the 
same time be biased. A measure of the accuracy of the 
estimation is the Fold Error, which is the ratio of the 
estimated value to the true value (So/ To). Fold error is 
computed for each model and estimation method and 
given in Figure 9 for data set 1 and Figure 10 for data 
set 2. Figures 9 and 10 show that for both data sets es- 
timation methods A and B give similar results and 
model 1 gives more accurate estimation of To than the 
other two models. Estimation method C generally gave 
less accurate estimates than methods A and B and 
model 2 performed best. When using estimation 
method D, again model 2 gave the most accurate esti- 
mates over both data sets and model 0 was the least 
accurate for both data sets. 
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Figure 7 Relation between logio(So_Mean) and iogio(To) for 
analysis of data set 1 with L = 0.8. loglOSo_Mean denotes base 10 
log of mean of So for the two replicates and the solid line denotes 
logio(So_Mean) = logio(Jo) which indicates perfect estimation. Panels 
A,B,C and D contain plots for the Standard curve, Dilution curve, 
Simultaneous curves, and Separate curves estimation methods, 
respectively. Plot symbols are: blue circle = model 0, red 
plus = model 1, green X = model 2. 
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Figure 8 Relation between logio(So) and logio(To) for analysis 
of data set 2 with L = 0.8. loglOSo_Mean denotes base 10 log of 
mean of So for the 5 replicates and the solid line denotes logio 
(So_Mean) = logio(To) which indicates perfect estimation. Panels A,B, 
and D contain plots for the Standard curve. Dilution curve. 
Simultaneous curves, and Separate curves estimation methods, 
respectively. Plot symbols are: blue circle = model 0, red 
plus = model 1, green X = model 2. 



Effect of varying L 

The plots of the Mean MS residual vs L and Mean log 
(Fold Error) vs L for the samples with dilution factor 0.1 
in data set 2 are given in Figures 11 and 12, respectively. 
This plot as well as plots for all dilutions in both data 
set 1 and 2 are given in Additional file 5. The plots given 
in Figures 11 and 12 were chosen because they show 
patterns which are representative of most of the plots 
given in Additional file 5. 
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Figure 9 Relation between Fold Error and logio(To) for analysis 
of data set 1 with L = 0.8. Fold Error is the ratio SO^O and a value 
of one indicates perfect estimation. Panels A,B,C and D contain plots 
for the Standard curve. Dilution curve. Simultaneous curves, and 
Separate curves estimation methods, respectively. Plot symbols are: 
blue circle = model 0, red plus = model 1 , green X = model 2. 
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Figure 10 Relation between Fold Error and logio(To) for 
analysis of data set 2 with L = 0.8. Fold Error is the ratio SO^O and 
a value of one indicates perfect estimation. Panels A,B,C and D 
contain plots for the Standard curve, Dilution curve, Simultaneous 
curves, and Separate curves estimation methods, respectively. Plot 
symbols are: blue circle = model 0, red plus = model 1, green 
X = model 2. 
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Figure 12 Relation between logio(Fold Error) and L for analysis 
of data set 2 with a range of L values. loglOSo_Mean denotes 
base 1 0 log of mean of Sq for the replicates. A value of 0 for logio 
(Fold Error) indicates perfect estimation. Panels A,B,C and D contain 
plots for the Standard curve. Dilution curve. Simultaneous curves, 
and Separate curves estimation methods, respectively. Plot symbols 
are: blue circle = model 0, red plus = model 1 , green X = model 2. 



Effect of errors in estimation of baseline fluorescence 

Plots of mean MSresidual and fold error in estimated 
initial target concentration versus baseline fluorescence 
for estimation methods A, B, and D are affected very lit- 
tle by variation in the value of used (plots not 
reported here). When using estimation method C, the 
MSresidual values were not affected much by variations 
in the bF but the initial target concentrations were over- 
estimated when using low values of bF. Increases in Fold 
Error by a factor of up to 10 was found in some plots 
when a bF value two standard error units below the esti- 
mated value was used. 
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Figure 1 1 Goodness of fit to qPCR curves for analysis of data 
set 2 with a range of L values. Plots are mean MSresidual vs the 

value of L used in the analysis. Panels A,B,C and D contain plots for 
the Standard curve. Dilution curve. Simultaneous curves, and 
Separate curves estimation methods, respectively. Plot symbols are: 
blue circle = model 0, red plus = model 1 , green X = model 2. 



Discussion 

Fit to qPCR curves 

The two models presented here have both been shown 
to give good fit to the lag and exponential phase of 
qPCR curves over a wide range of initial target concen- 
trations while using a single set of rate constants. The 
lack of fit of model 0 to qPCR curves seen in Figures 4, 
5 and 6 is likely to be due to the same cause as the vari- 
ation in k values pointed out by Boggy and Woolf (2010) 
[14] . The derivation of the Boggy and Woolf [14] for- 
mula assumes their parameter k = ka/kai2 where ka and 
kai2 are the rate constants of model 2 (Figure 3). Thus 
the same value of k should apply to all qPCR curves that 
are identical except for initial target concentration. I 
have shown that the parameter k of the Boggy and 
Woolf (2010) [14] model is not constant but depends on 
the primer concentration during the PGR process. Esti- 
mation methods A, B, and C force model 0 to use the 
same k value for all curves and thus each curve gives a 
poorer fit than it would if allowed to have a unique k 
value for each curve. These results suggest the Boggy 
and Woolf (2010) [14] assumption that primer concen- 
tration is sufficiently large that it can be ignored is not 
met in these data sets. 

Estimation of Tq 

The results here indicate the best methods for estima- 
tion of target concentration in an unknown sample is to 
use a standard curve that has a range of values for initial 
target concentration or to use a dilution curve com- 
posed of a series of dilutions of the original sample and 
analyze the qPCR curve with estimation methods A and 



Cobbs BMC Bioinformatics 2012, 13:203 
http://www.bionnedcentral.conn/1471 -21 05/1 3/203 



Page 11 of 13 



B, respectively. Figures 4, 5, 7, 8, 9, and 10 indicate these 
two methods (A and B) are essentially equivalent in their 
accuracy of estimation. Note that with the dilution curve 
method, B, the absolute initial concentrations of target 
are estimated independently of the known values. Only 
the dilution ratios are used in the simultaneous fit to the 
data. The best way to estimate target concentration of 
an unknown sample when a standard curve is not avail- 
able is to make a series of dilutions of the sample and 
then apply qPCR to each dilution and use the dilution 
curve method described here to estimate the target con- 
centration of the undiluted sample. The separate curves 
method, D, worked well for data set 1 for both models 1 
and 2, and worked moderately well only for model 1 for 
data set 2. Lastly, estimation method C gave good esti- 
mates for data set 2 but poorer estimates for data set 1. 
More sources of data need to be studied to determine 
the suitability of estimation methods C and D with these 
models. 

The poor fit of the Boggy and Woolf (2010) [14] 
model to qPCR curves with estimation methods A, B, 
and C suggest it will give poorer estimates of initial tar- 
get concentration than models giving good fit. Estima- 
tion method D allows each curve to have unique 
parameter values and gives much better fit to qPCR 
curves when using model 0, however, the accuracy of 
estimates is not as good as those for models 1 and 2. 
The non-linearity of initial target concentration esti- 
mates for Model 0 with method D using data set 2 as 
seen in Figure 8 panels C and D data is different than 
the result reported by Boggy and Woolf (2010) [14] who 
obtained a very linear relationship using method D. The 
reason for this difference is not clear but may be due to 
the fact that model fitting is done differently. Boggy and 
Woolf (2010) [14] estimated baseline fluorescence as a 
parameter in the model. I estimate baseline fluorescence 
before fitting the model and treat it as a constant during 
model fitting. Other minor differences in the programs 
used for fitting and the searching routine may have also 
caused some difference in the result. 

Effect of varying L 

Figures 11 and 12 as well as those given in Additional 
file 5 indicate the effects of variation of L depend on the 
estimation method used and to a lesser extent on the 
concentration of initial target sequence. Examination of 
the plots in Additional file 5 indicate the effects of L can 
be variable but some general patterns are present which 
are shown in Figures 11 and 12. First, for all models, all 
methods and all dilutions, MSresidual increases with in- 
creasing L. All the plots involve a log scale and show an 
approximate linear increase of MSresidual with L indi- 
cating an underlying exponential increase of MSresidual 
with L. Thus when L has values of 0.9 and higher the fits 



to qPCR curves are much worse than with lower values 
of L. Secondly, the effect of L on the accuracy of the 
estimated initial target concentration depends strongly 
on the estimation method used. Estimated target con- 
centrations obtained with estimation methods A and B 
show remarkably little dependence on the value of L 
used for either of the data sets analyzed. There is a con- 
sistent trend for a slight over-estimation of target con- 
centration for L values above 0.8 with the amount of 
over-estimation increasing with increasing L. The de- 
pendency of the accuracy of estimation by methods C 
and D on L was more variable, however, both methods 
showed a patterns similar to that present in methods A 
and B, but more extreme over-estimation of initial target 
concentration when L values above 0.8 are used. For 
these reasons the value of 0.8 was used for L in the ana- 
lyses presented here. Though the use of L = 0.8 gave 
good fits to qPCR curves and accurate estimates of ini- 
tial target concentration, investigation of other methods 
for restricting analysis to only parts of the qPCR curve 
may be worthwhile. 

Effect of errors in estimation of baseline fluorescence 

Estimation methods A, B and D are robust to errors in 
baseline fluorescence. Estimation method C is less 
robust to such errors. 

General comments 

The models presented here assume the annealing of 
single-stranded DNA to double-stranded molecules 
reaches equilibrium at each cycle. This assumption may 
not be true, however, the ability of the present models to 
fit qPCR curves as well as they do suggests it may be ap- 
proximately true. Model 1 gave slightly better fits to 
qPCR curves and slightly better estimates of initial con- 
centration than Model 2 when estimation methods A 
and B were used. This result suggests the reversible 
reactions assumed in model 1 (Figure 2) may be a better 
model than the non-reversible reactions assumed in 
model 2 (Figure 3). It is thought the dissociation con- 
stant for double-stranded templates is so small that the 
annealing of templates is effectively non-reversible [8]. 
However, the annealing of primer and template may pro- 
duce a much less stable double-stranded structure such 
that the reversible reaction model is more reasonable. A 
model in which template-template molecules are com- 
pletely stable and primer-template molecules are not is 
not useful when considering equilibrium solutions, as 
equilibrium then occurs only when the concentration of 
primer-template molecule is zero. To use such a model, 
a non-equilibrium solution would need to be found. It is 
also possible that there is not enough time allowed dur- 
ing the experiments for the re-association reactions to 
reach equilibrium. Whether or not DNA re-association 
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achieves equilibrium or not during PGR is not clear. 
Smith et al, (2007) [13] offer some analysis suggesting 
that equilibrium does occur. 

An advantage of the mechanistic approach to model- 
ing qPCR curves is that models describe actual events of 
the process and thus may be expanded to include other 
effects that may improve accuracy. Kinetic models natur- 
ally allow simultaneous estimates of common para- 
meters which will increase accuracy. This is extremely 
difficult or impossible with other methods used for mod- 
eling qPCR. Because the kinetic models account for vari- 
able efficiency by kinetic theory they may give better 
estimates of initial target concentration than other 
approaches. Another advantage of kinetic models is that 
parameters of the model may be estimated by dedicated 
experiments distinct from the qPCR curve experiment. 
For example, the parameter Kj present in both models 1 
and 2 converts double- stranded DNA concentration into 
amount of fluorescence for a particular experimental 
system. The value of Kf could be determined in experi- 
ments separate from qPCR and then used as a constant 
in the analysis of a qPCR curve, thus increasing the ac- 
curacy of estimation of the kinetic parameters and target 
concentrations. Lastly, kinetic models uniquely allow es- 
timation of absolute initial concentrations of target se- 
quence without use of a standard curve of any type. The 
accuracy of estimation with kinetic models is enhanced 
greatly by the use of a standard curve, but it is not 
required. In fact the dilution curve method gave fits es- 
sentially as good as the standard curve method. The 
mechanistic models developed here offer a more 
complete description of the amplification occurring in 
qPCR, fit observed data very well, and allow more accur- 
ate estimation of initial target concentration than other 
methods. 

Conclusions 

Two stepwise kinetic equilibrium models of qPCR are 
presented and analytical solutions are given for equilib- 
rium values during annealing of single stranded to 
double stranded molecules. The models are amenable to 
different types of non-linear fitting which include fitting 
several curves simultaneously when they have common 
parameter values. Both models are shown to give very 
good fit to qPCR data with a wide range of initial target 
concentrations with a single set of values for rate con- 
stants. The two models also give accurate estimates of 
initial absolute target concentration using several differ- 
ent methods for estimation. Using the models with data 
from a standard curve gives accurate estimates of initial 
absolute target concentration. In the absence of a stand- 
ard curve, a dilution curve method also provided accur- 
ate estimates of the initial absolute target concentration. 
In the absence of either a standard or dilution curve the 



models provide estimates, though less accurate, of abso- 
lute initial target concentration. These models presently 
give the best unified theory for the interpretation of 
qPCR data in that they explain well the shape of the 
qPCR curve and how it is affected by variation in the ini- 
tial target concentration. 
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